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Abstract  -  For  many  applications  of  radar  and  sen¬ 
sor  based  filtering,  simulations  can  not  represent  the 
sole  estimate  of  performance,  provide  points  where 
threats  become  engageable,  or  determine  when  to  use 
weapons  ’  platform  based  sensors  effectively  in  an  en¬ 
gagement,  etc...  No  significant  advances  have  been 
proposed  to  analytically  characterize  performance  or 
at  least  bound  performance  of  the  Kalman  filter  other 
than  the  use  of  simple  two  or  three  state  constant 
gain  filters.  This  paper  suggests  methods  for  char¬ 
acterizing  filter  algorithms  that  can  be  used  to  bound 
the  advanced  tracking  algorithms  that  are  used  in  a 
single  sensor  or  muli-sensor  environment. 
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1  Introduction 

Significant  advances  have  been  made  in  the  last  forty 
years  in  target  tracking  that  have  enhanced  radar 
based  tracking  well  beyond  Kalman’s  publication  of 
the  two  papers  that  define  Kalman  filtering.  At  the 
same  time  that  significant  advancements  techniques 
have  been  proposed  by  theoreticians  to  deal  with  the 
real  problems  of  target  tracking  that  the  Kalman  filter 
has  proven  inadequate  to  solve.  However,  no  significant 
advances  have  been  proposed  to  analytically  character¬ 
ize  filter  performance  or  at  least  bound  performance  of 
these  filters  other  than  the  usage  of  simple  two  and 
three  state  constant  gain  filters.  This  is  a  problem 
for  many  applications  of  radar  and  sensor  based  fil¬ 
tering,  because  system  performance  characterization 
cannot  depend  on  simulations  to  estimate  performance 
on  the  fly,  so  to  speak,  or  to  provide  points  when 
threats  become  engageagble,  or  to  determine  when  to 
use  weapon’s  platform  based  sensors  effectively  in  an 
engagement,  etc. 

The  cost  of  a  statistically  significant  sampling  the 
statistical  universe  in  which  a  weapons-sensor  plat¬ 
forms  exist  is  prohibitive  due  to  the  expanse  of  simula¬ 
tion  time  required  to  achieve  adequate  sampling  which 
has  significant  financial  cost.  Thus,  there  is  always 
a  need  for  analytical  approaches  to  characterize  the 
interaction  between  filters  and  the  sensor  operational 


environment.  The  reasons  for  insistence  on  analytical 
methods  over  pure  simulation  are  many,  but  it  is  suf¬ 
ficient  to  note  that  without  useful  rules  of  thumb,  ap¬ 
prentice  engineers  produce  filter  design  that  are  clearly 
nonsense  to  those  practitioners  who  have  gained  expe¬ 
rience  over  many  years  with  functioning  filters  that 
are  correctly  designed.  These  rules  of  thumb  become 
even  more  important  in  the  multi-sensor  fusion  envi¬ 
ronment,  where  the  level  of  experience  and  history  of 
performance  of  the  algorithms  is  limited.  The  general 
characteristics  that  bound  the  performance  of  the  fu¬ 
sion  algorithms  are  small  compared  to  the  statistical 
universe  that  the  algorithm  will  be  operating  in. 

Thus,  it  would  be  valuable  to  be  able  to  character¬ 
ize  tracking  algorithm  performance  in  the  single  sensor 
and  multi-sensor  environment  so  that  a  useful  charac¬ 
terization  of  performance  is  available.  This  provides 
the  ability  of  a  user  to  perform  sanity  checks  of  fil¬ 
ter  outputs  for  incorporation  in  the  control  loop  of  the 
sensor-weapons  platform  and  examine  weapon  effec¬ 
tiveness.  This  paper  suggests  some  methods  for  char¬ 
acterizing  filter  algorithms  that  can  be  used  to  bound 
the  advanced  tracking  algorithms  that  are  used  in  a 
both  a  single  sensor  multi-sensor  environments.  In  ad¬ 
dition,  this  paper  attempts  to  throw  down  the  gauntlet 
to  other  practitioners  to  present  methods  of  their  own 
on  some  future  occasion. 


2  Constant  Gain  Filters 


The  tracking  equations  for  the  a  —  filter  (the  neces¬ 
sary  background  is  found  in  the  books  by  Bar-Shalom 
[2]  and  Blackman  [4])  consist  of  two  parts:  prediction 
equations,  which  are  given  by 


Xp{k)  =  AXs{k  -  1) 


where 


Xp{k)  = 


Xp{k) 

Vp{k) 


A  = 


1  T 
0  1 


and  smoothing  equations,  which  are  given 


(1) 

(2) 

(3) 


Xs{k)  =  FXs{k-l)+Gx^{k)  (4) 
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where 


(13) 


Xs{k)  = 


T 


Xs{k) 

Vs{k)  _  ’ 

(1  -  a)T 
1-/3 


(5) 

(6) 


and 


G  = 


a 

I 


(7) 


L  T  J 

•  Xs{k)  =  smoothed  position  at  the  k-th  interval 

•  Xp{k)  =  predicted  position  at  the  k-th  interval 

•  Xm{k)  =  measured  position  at  the  k-th  interval 

•  Vs{k)  =  smoothed  velocity  at  the  k-th  interval 

•  Vp{k)  =  predicted  velocity  at  the  k-th  interval 

•  T  =  radar  update  interval  or  period 

•  a,  P  =  filter  weighing  coefficients 


The  question  of  the  selection  of  filter  coefficient  values 
and  the  relationship  between  the  coefficients  used  by 
tracking  filters  to  determine  pointing  commands  for  a 
tracking  radar  dates  back  at  least  as  far  as  work  by 
Sklansky  [16].  Sklansky  proposed  performance  mea¬ 
sures  including  stability,  transient  response,  noise  and 
maneuver  error  as  a  function  of  the  dynamic  parame¬ 
ters  a  and  /3.  All  of  the  work  was  based  on  a  frequency 
domain  or  z-transform  analysis.  Benedict-Bordner[3] 
proposed  a  relationship  between  a  and  /3  based  on  a 
pole-matching  technique  that  combined  transient  per¬ 
formance  and  noise  reduction  capability.  Analysis  per¬ 
formed  by  Simpson  [15],  Neal,  and  Benedict  [14]  ex¬ 
tended  this  analysis  to  the  a  — /3  — 7  filter.  Later,  much 
of  this  work  was  summarized  in  the  open  literature  by 
Kalata  [11].  A  summary  of  subsequent  developments 
in  the  literature  to  1992  is  found  in  Kalata  [11]  with 
some  additional  work  since  then  found  in  Gray  [7]  , 
and  in  the  open  literature. 

There  are  several  different  dynamics  models  that 
lead  to  an  a  —  /3  filters  with  different  statistical  and 
performance  attributes.  While  all  filters  have  the  same 
noise  reduction  ratios  for  position  and  velocity,  they 
have  different  transient  responses  or  bias  depending 
on  which  threat  model  one  uses.  In  matrix  form,  the 
predicted  update  is  (note  this  model  lumps  maneuver¬ 
ability  uncertainty  in  to  the  velocity  component) 

X{k  +  l)  =  ^X{k)  +  'i>w{k),  (8) 

while  the  measurement  model  in  matrix  form  is 


where 


z{k  -I-  1)  =  HX{k  -I-  1)  -b  n{k  +  1); 


= 


0 

1  ’ 


$  = 


1  T 
0  1  ’ 


H  =[l  0  ]  , 


(9) 

(10) 

(11) 

(12) 


X{k) 


x{k) 
v{k)  ’ 


which  is  termed  Model  1  for  the  form  of  4/.  An¬ 
other  approach  to  modeling  maneuver  uncertainty  is 
to  incorporate  it  into  both  the  position  and  velocity 
prediction  components:  in  matrix  form  (Model  2) 


X{k  +  l)  =  <PX{k)  +  'S2Wa{k)  (14) 


where 


4'2  = 

■  rV2  ■ 

T 

$  = 

'IT' 

0  1  J  ’ 

x{k)  = 

1 - 1 

(15) 

(16) 
(17) 


and  H  is  the  same  as  before.  In  [8] ,  a  method  is  devel¬ 
oped  for  solving  the  Lyapunov  equations  for  these  type 
of  models.  For  the  steady  state  performance,  the  noise 
reduction  ratios  are  the  position  (K^),  velocity  {Ky), 
and  position- velocity  cross  term  {K^v)  which  are  given 
as 


Ky  = 


2a2  +  j3{2  -  3a) 


Ky  = 


D 

2/3^ 

^2’ 


(18) 

(19) 


Ky 


/?(2a  -  /3) 
DT 


(20) 


where  D  =  a(4  —  2a  —  /3).  For  Model  2,  the  noise 
reduction  ratio  for  the  position  and  the  noise  reduc¬ 
tion  ratio  for  the  velocity  are  given  as  before  while  the 
transient  reduction  ratios  are 


Tl{Q) 


(1  -  afT^ 


(21) 


and 


r2(o) 


T2  r  2a2  -  3a/3  -b  2/3 
4  a/3 


(22) 


Note,  this  method  applies  to  any  constant  gain  filter,  so 
higher  state  filters  can  be  characterized  as  well  by  solv¬ 
ing  the  Lyapunov  equation  analytically.  Thus,  noise 
covariance,  both  steady  state  noise  reduction  and  tran¬ 
sient  response  have  analytical  models  which  character¬ 
ize  performance  by  this  method.  The  various  transient 
noise  reductions  are  useful  for  estimating  filter  per¬ 
formance  when  un-modeled  dynamic  behavior  occurs. 
When  combined  with  the  noise  reduction  ratios,  these 
provide  performance  envelopes  that  bound  filter  per¬ 
formance.  In  addition  to  bounding  performance,  the 
Jacobian  of  the  noise  reduction  and  transient  perfor¬ 
mance  ratios  allow  one  to  determine  one  of  four  possi¬ 
ble  relationship  (Benedict-Bordner,  Kalata  [11],  Con¬ 
tinuous  White  Noise,  and  an  unnamed  one)  between 
the  filter  coefficients,  e.g.  a  =  a  (/3)  as  discussed  in  [9]. 

The  other  aspect  of  filter  design  is  selection  of  the 
filter  gains.  This  can  be  accomplished  by  introduc¬ 
ing  a  cost  function  methodology.  A  function  of  mean 
squared  error  is  taken  to  be  the  arbitrator  of  perfor¬ 
mance.  While  mean  square  is  widely  accepted,  other 


normed  distance  measure  could  be  used.  A  mean 
squared  cost  function  can  be  formed  by  a  combination 
of  noise  reduction  and  filter  response  to  an  un-modeled 
term  [9].  A  normalized  cost  function  (Jn)  (normalized 
means  the  mean  squared  error  equals  the  normalized 
cost  times  the  normalization  constant  N  which  for  the 
steady  state  velocity  cost  function  is  In  gen¬ 

eral,  all  cost  functions  are  of  the  form 


where  the  velocity  lag  coefficient  is  dehned  as  r.  The 
acceleration  bias  of  the  smoothed  position  response  hl- 
ter  is 

Lx  =  o-oT'^Ix,  (34) 

where  the  position  lag  coefhcient  is  defined  as 


1-a  _  1  1 


(35) 


JN{a,l3)  =  f{a,l3)+g{a,/3)A‘^,  (23) 

where  /,  g  are  functions  of  a  and  j3  that  depend  on 
which  Model  1  is  being  considered  and  A  is  (oq  is 
the  un-modeled  acceleration  and  a  is  the  measurement 
noise) 

A2  =  (24) 

where  ag  is  the  un-modeled  acceleration.  A  normalized 
cost  function  based  on  velocity  noise  reduction  ratio 
and  the  Model  2  velocity  transient  noise  reduction 
ratio  is 

=  (25) 

where 

A2^(a,/3)=  (26) 


Note  both  (3  and  a  can  always  be  written  in  terms  of 
lag  coefficients  as 


/3 


2 

2  {lx  -l-  t)  -|-  1 


(36) 


and 

(2r  -H  1) 

Of  ^  - 

2  {lx  3-  t)  +  1 


(37) 


so  ^  ^  =  T  which  does  not  depend  on  which  relation¬ 

ship  one  chooses  between  a  and  /3.  The  noise  reduction 
ratios  expressed  in  the  lag  form  can  be  expressed  as 


and 


Ly  {lx  ?  ^ ) 

Lx  {Ix^  ) 


2 

{2lx  +  t){2t  +  1) 
(2t^  2lx  T  t) 

{2lx  +  t){2t  +  1) 


For  the  Model  2  transient  terms  we  have 


(38) 


(39) 


A  cost  function  based  on  position  noise  reduction  ratio 
and  the  Model  2  position  transient  noise  reduction 
ratio  is 


J^{2,a,l3) 


2a^  -  I3{3a  -  2) 
a{4  —  2a  —  (3) 


+  ^2  (“i  /5)A^, 


(27) 


where 


X^{a,P) 


2a^  -  3a/3  -b  2/3 
a/3 


(28) 


Another  combination  for  a  possible  cost  functions 
based  on  the  steady  state  lags  can  be  achieved  by  com¬ 
bining  the  position  noise  reduction  ratio  and  position 
lag  which  gives, 


J^{S,a,p) 


2a^  -  /3(3a  -  2)  ,  , 

a(4-2a-/3)  " 


(29) 


where 

(30) 

while  the  steady  state  normalized  velocity  lag  cost 
function  is  given  by: 

"^(^---^)=a(4-2a-/3)+^-^^-  (31) 


where 


a  1 
p~2- 


(32) 


The  acceleration  bias  of  the  smoothed  velocity  re¬ 
sponse  of  the  hlter  or  bias  is  given  by: 


(33) 


Ar(a,/3)  = 


2  (2t^  -|-  2lx  T  '^) 


and 


X^{a,P)  = 


(2t  +  1) 

(2t  +  1)- 


(40) 


(41) 


Examining  the  filter  coefficient  relationships  in  terms 
of  the  velocity  lag  coefficient  r,  gives  several  in¬ 
teresting  results.  For  each  of  the  filter  coefhcient 
relationships, we  can  express  the  hlter  coefhcients  in 
terms  of  the  lag  r  coefhcient  form  (e.g.  the  form  that 
depends  just  on  r  and  not  lx)-  The  velocity  lag  is  the 
same  for  all  three  coefhcient  relationships,  but  Ix{t)  is 
different.  Also,  only  for  the  Kalata  relationship  does 
Ix{t)  0,  the  others  don’t.  Also  if  we  dehne  r'  =Tt, 

T— >0 

then  if  we  replace  r  with  J  and  then  replace  T  with 
an  arbitrary  update  interval  and  still  have  [13] 


afc  =  1  - 


(Tfe+r)’ 


(42) 


and 


B 

{Tk+rY 


(43) 


Thus  the  Kalata  hlter  coefhcient  relationship  is  pre¬ 
served  for  arbitrary  aperiodic  updates  ,  which  main¬ 
tains  the  nominal  hlter  gains  and  preserves  the  hlter 
coefhcient  relationship.  The  lag  characteristics  sug¬ 
gest  a  difference  between  the  properties  of  the  hlter 
coefhcient  relationships  based  on  position  lag,  which 
suggests  that  the  Kalata  relationship  is  preferred  in 
our  multi-platform  application  because  of  communica¬ 
tion  and  computation  delays  that  are  mitigated  for  the 


Ly  —  Clg'3  'ly  —  Clg'3 't” 


Kalata  relationship  only.  Once  a  particular  filter  coef¬ 
ficient  relationship  is  chosen  a  =  a(f3),  the  cost  func¬ 
tion  can  be  expressed  in  terms  of  t.  The  cost  function 
can  then  be  minimized  to  give  t  =  r  (A)  and  hence 
a  =  a  (A). 

Thus  hlter  performance  can  be  expressed  in  a 
tractable  form  based  on  one  design  parameter  A.  Since 
all  of  this  can  be  accomplished  using  a  minimum  mean 
square  approach,  a  very  good  characterization  of  a 
single  filter  boundaries  exists  based  on  an  analytical 
characterization,  provided  a  good  estimate  of  A  exists. 
Thus,  the  ability  to  match  any  single  model  Kalman 
filter  performance  to  within  a  few  percent  exists  (Dale 
Blair — private  communication).  Early  work  on  extend¬ 
ing  these  concepts  to  three  state  hlters  are  found  in 
[15]  and  [14];  but  the  methods  presented  here  illus¬ 
trated  by  the  a  —  j3  filter  are  easily  extended  to  any 
multiple-state  constant  gain  hlter.  Furthermore,  the 
label  constant  gain  for  the  hlter  is  a  bit  miss-leading. 
Since  a  =  a  (A) ,  and  we  can  always  update  an  esti¬ 
mate  of  A  as  frequently  as  necessary  up  to  including 
the  sensor  update  rate  to  maintain  desired  track  ac¬ 
curacy.  So  designing  a  hlter  this  way  provides  a  tight 
bound  of  performance  relative  to  the  single  stage  hl¬ 
ter  one  is  trying  to  arrive  at  a  bound  performance  for. 
For  multiple  model  hlters,  the  situation  becomes  more 
complicated. 

For  a  multiple  model  hlter,  such  as  the  Interacting 
Multiple  Models  (IMM),  the  analytical  approach  is  also 
helpful.  Many  tracking  simulations  are  run  using  the 
IMM  with  two  models:  constant  velocity  and  nearly 
constant  velocity,  or  nearly  constant  velocity  and  con¬ 
stant  acceleration.  It  is  difficult  to  decide  which  tran¬ 
sition  matrix  is  optimal,  but  it  is  clear  that  certain 
choices  are  dehnitely  sub-optimal.  Matrices  near  those 
suggested  by  the  sojourn  time  calculation  behave  well. 
However,  while  perturbing  the  matrix  a  small  amount 
away  from  the  suggested  values  does  result  in  a  change 
in  performance,  it  is  difficult  to  decide  whether  the 
change  is  for  the  better,  since  performance  is  not  mea¬ 
sured  by  one  number,  but  by  a  balance  of  competing 
interests.  The  results  obtained  imply  that  having  three 
models  is  not  necessary  except  in  unusual  cases.  Also, 
the  values  of  the  off-diagonal  elements  do  not  matter, 
except  for  the  requirement  that  the  rows  sum  to  one. 
Recall  the  IMM  algorithm  consists  of  six  steps: 


ces: 

x°^{k-l\k-l)  =  '^x\k-l\k-l)ij,,y{k-l\k-l) 

(45) 

P°^k-l\k-l)  = 

J2f,^^^(k-l\k-l)-{P\k-l\k-l)+ 

I 

[x\k  -  l\k  -  I)  -  ^^{k  -  l\k  -  I)]  X 
[x\k-l\k-l)-x'^^{k-l\k-l)\'} 

Each  hlter  produces  an  estimate  of  position,  and 
if  the  hlter  were  operating  alone,  that  estimate 
would  be  used  during  the  next  iteration  as  an  ini¬ 
tial  value.  In  the  IMM  algorithm,  these  estimates 
are  mixed  together,  so  that  the  input  to  hlter  j  is 
the  estimate  of  the  position  most  likely  at  time 
fc  —  I,  given  that  model  j  is  in  effect  at  time  k. 

3.  Apply  filters: 

Now  the  estimates  and  covariance  matrices  calcu¬ 
lated  in  the  step  before,  and  the  observation  taken 
at  time  k,  are  used  as  input  to  the  Kalman  filters. 
Each  filter  behaves  normally  at  this  point. 

4.  Compute  model  likelihood  functions: 

A,{k)=P{z{k)\M,{k),Z'^-^)  (46) 

The  likelihood  function  will  be  used  to  update  the 
probabilities  of  the  various  models.  The  likeli¬ 
hood  of  model  j  at  time  k  is  defined  to  be  the 
probability  of  observing  the  value  that  was  actu¬ 
ally  observed,  given  the  previous  history  and  the 
assumption  that  model  j  was  in  effect  at  time  k. 

5.  Compute  model  probabilities: 

-  1)  (47) 

i 

This  calculation  represents  using  the  observation 
at  time  k  to  update  of  the  probability  of  each 
model  being  in  effect.  The  value  c  is  a  normalizing 
constant. 


1.  Weights  update: 


6.  Compute  updated  state  estimate  and  co- 
variance  matrix: 


Hi\j{k-l\k-l)  = —p^jiJ.^{k-l)  (44) 


x{k\k)  =Y^x\k\k)pi{k),  (48) 

i 


This  is  a  calculation  of  the  probability  that  model 
i  was  in  effect  at  time  fc  —  I,  given  that  model 
j  is  in  effect  at  time  k.  In  the  above  equation, 
the  value  Cj  is  a  normalizing  constant,  Pij  is  the 
probability  of  a  transition  from  model  i  to  model 
j,  and  Piik  —  I)  is  the  probability  that  model  i 
was  in  effect  at  time  k  —  1. 

2.  Mix  input  estimates  and  covariance  matri- 


P{k\k) 


/ij(A:|fc){P*(fc|A:) -I-  x^ {k\k)  —  x(k\k)\  x 


x^{k\k)  —  x{k\k)\' }. 


(49) 


This  calculation  is  only  necessary  if  an  output  of  a 
composite  estimate  and  covariance  matrix  is  desired. 
These  values  are  not  used  elsewhere  in  the  algorithm. 


The  Kalman  filter  steps  can  be  replaced  with  two 
different  filter  models  using  different  analytical  models 
based  on  either  transient  or  steady  state  performance. 
For  a  three  model  case,  one  would  use  a  three  state  an¬ 
alytical  filter.  Noting  that  for  the  IMM  algorithm,  the 
matrix  [pij]  calculates  the  model  probabilities  for  the 
final  output.  The  existence  of  this  matrix  stems  from 
a  representation  of  the  target’s  maneuvers  as  transi¬ 
tions  from  state  to  state  in  a  Markov  chain.  Thus,  it 
makes  sense  to  apply  Markov  theory  in  the  analysis 
of  this  problem.  In  a  Markov  chain,  is  the  sojourn 
time,  that  is,  the  expected  amount  of  time  continuously 
spent  in  state  i,  and  is  given  by 


This  follows  by  noting  [5]: 


Given  a  number  of  asynchronous  valid  tracks, 
{Xi{tki\tki),Piitki\tki)},  i  =  l,2,...,n,  that  arrive  at 
the  fusion  center  during  the  time  interval  [tk-i,tk], 
find  the  best  estimate  in  the  minimum  mean  square 
sense  of  the  system  state  at  time  tk  when  it  is  com¬ 
puted  according  to  the  fusion  rule. 

n 

Xf{tk\tk)  =''^Li{tki)Xi{tki\tki)  (52) 

i=l 

where  the  Li ’s  are  unknown  weighting  matrices  to  be 
determined.  The  error  of  the  fused  track  at  time  tk  = 
kTf  is 

Xf{tk\tk)=Xfitk\tk)-X{tk)  (53) 

Using  the  system  dynamics  and  the  fusion  rule  gives 


U(Time  in  i\  State  at  time  0  is  z)  =  - - .  (51) 

1  -  Pa 

Thus,  the  IMM  filter  performance  can  be  viewed  as 
a  weighted  combination  of  individual  analytical  filters 
with  the  weights  based  on  the  sojourn  time  of  a  par¬ 
ticular  filter.  While  not  tested  yet,  it  is  expected  that 
this  should  yield  fairly  accurate  estimate  of  IMM  per¬ 
formance  boundaries  and  will  be  the  subject  of  a  future 
publication. 


Xf{tk\tk) 


LiXi{tki \tki)  + 

i=l 


X 


xitk)-Y,L^Htk„tk)Wi^!:^ 

i^l 


(54) 


If  all  the  local  filters  are  unbiased,  then 


^[Li$(4,,4)  - 1  ] 


=  0, 


(55) 


3  Fusion  Algorithms 


so  the  fused  track  is  unbiased  if 


In  order  to  improve  the  quality  of  the  state  estimate, 
local  tracks  or  data  can  be  communicated  from  the  sen¬ 
sors’  platform  to  a  central  site  for  the  purpose  of  esti¬ 
mation  fusion.  The  results  of  this  paper  apply  to  an 
arbitrary  communication  rate.  The  sensor  platforms 
can  communicate  their  data/track  after  every  update, 
after  a  given  number  of  updates,  or  after  every  time 
period.  Communication  delays  exist  between  the  sen¬ 
sor  platforms  and  the  fusion  center.  Let  t  =  tk-i  be 
the  last  time  the  fusion  center  performed  a  track  fusion 
and  let  tk  =  tk-i  +  Tf  be  the  time  of  the  next  fusion 
time.  The  period  Tf  is  an  adaptive  design  parameter. 
It  can  be  as  small  as  the  time  it  takes  to  receive  two 
tracks  from  two  different  sensor  platforms  or  as  long 
as  the  time  it  takes  to  receive  as  many  tracks  as  the 
total  number  of  sensors  in  the  network.  The  value  of 
Tf  mainly  depends  on  the  priority  given  to  the  track 
fusion  task  which  depend  on  the  application  at  hand. 
The  number  n  of  data/tracks  to  be  fused  during  the 
interval  \tk-i,  tk\  is  an  arbitrary  number  between  2 
and  m,  where  m  is  the  number  of  senors  used.  Be¬ 
cause  of  the  difference  in  communication  delays,  some 
of  the  validated  tracks  may  arrive  out-of-  sequence.  To 
deal  with  the  out-of-  sequence  tracks  without  special 
processing,  all  the  tracks  that  arrive  at  the  fusion  cen¬ 
ter  during  the  time  [tk-i,tk]  after  removing  redundant 
tracks,  are  fused  simultaneously.  The  asynchronous 
multi-sensor  track  fusion  problem  can  be  stated  as  fol¬ 
lows: 


Y,L^<^>{tk,,tk)  =  I  (56) 

i=l 

This  represents  the  first  constraint  on  the  choice  of  the 
weighting  matrices.  Therefore, 

n—1 

Ln  =  ^itk,tkj Li^{tki,tkJ.  (57) 

i=l 

The  error  covariance  matrix  of  the  fused  track 
can  be  defined  as 

Pf{tk\tk)  =  E{Xf{tk\tk)Xf{tk\tk)  }  (58) 

which  allows  determination  of  the  weights  Li  which  de¬ 
fine  the  optimal  asynchronous  track  fusion  filter.  Fur¬ 
ther  details  are  found  in  [1] 

Theorem  1:  The  error  covariance  matrix  of  the 
fused  track  using  the  fusion  rule  is  given  by 

n—1 n—1  n—1  n—1 

Pf{k\k)  =  LjMjjLb-i-'y^^  LiNi  +  ''^^  NiL^-\-Mn 

i=l  9  =  1  i=l  9=1 

(59) 

Proof:  See  [1]. 

Theorem2  :  The  minimum  mean  square  solution  of 
the  asynchronous  track  fusion  problem  using  the  fusion 
rule  is  given  by 

n 

Xf{tk\tk)  =  LjXjftk, \tki) 

9=1 


(60) 


Pf{tk\tk)  =  LML  +LN  +  n'l  +Mn  (61) 
where 

[Li  L2  ...  i„-i]  =  -N'm-^  (62) 

n—1 

=  (63) 

i=l 

Proof:  See  [1]. 

The  reason  for  this  example  is  that  it  is  indicative 
of  all  fusion  algorithms,  they  are  variations  on  a  sim¬ 
ilar  theme.  With  the  proper  redefinitions  of  matrices, 
both  track  and  data  fusion  amount  to  the  same  thing: 
weighted  (positive  semi-definite  and  the  sum  is 
normalized  to  one)  combinations  of  data.  The 
results  can  be  extremely  complicated,  difficult  to  un¬ 
derstand,  difficult  to  predict  in  terms  of  performance, 
and  difficult  to  determine  the  algorithm’s  underlying 
correctness,  but  they  have  the  same  underlying  math¬ 
ematical  form  when  understood  properly.  Given  this 
observation,  an  alternative  suggests  itself  as  a  means  of 
understanding  these  types  algorithms  based  on  maxi¬ 
mum  entropy  analysis  to  interpret  the  probabilities  as 
the  solution  to  a  maximum  entropy  problem.  There  are 
two  approaches  to  developing  this  methodology  which 
will  now  be  review. 

4  Maximum  Entropy  Procedure 
As  A  Means  Of  Understand- 


maximize  is  the  entropy,  so  it  is  adjoined  to  the  free 
parameters  (Lagrange  undetermined  multipliers)  times 
the  equations  for  the  constraints  to  form  a  Lagrangian, 
which  for  entropy  maximization  is  (take  fc  =  1) 

L  =  -  (p|lnp)  -  Ao  ((lb)  -  1)  (65) 

m 

-'^Xr  [{p\gr  {\x))}  -  ar]  . 

r—1 

To  minimize  with  respect  to  the  probabilities,  one  com¬ 
putes  the  particular  state  (pje,)  such  that  g^^^is  a 
minimum: 


dL 

d{p\ei) 


m 

=  0  =  In  Pi  -|-  1  -|-  Aq  -f  XrPr  (Xi)  . 

r—1 


(66) 


Solving  for  the  state’s  probability  {p\ei)  =  pi  gives  the 
probability  assignment  as 


(pbi)  = 


(67) 


Substituting  probabilities  into  Condition  1  gives  first 
Lagrange  undetermined  multiplier  as 


g(Ao+i)  ^  ^g(-Er=i^-s-G.)) 

n 

=  (68) 

i-1 


ing  Fusion  Schemes 

The  degree  of  uncertainty  (which  is  equivalent  to  the 
surprise  value)  in  the  information  is  defined  as  the  en¬ 
tropy: 

n 

S[\p);n\  ==  -k{p\\np)  =  -k^pilnpi;  (64) 

Most  applications  do  not  permit  one  to  measure  the 
probabilities,  pi,  associated  with  a  physical  variable, 
f{xi),  instead  the  expected  value,  {f{x)) ,  is  measured. 
Probabilities  are  connected  to  expected  values  by  us¬ 
ing  the  formula  for  {f{x)).  The  goal  of  a  maximum  en¬ 
tropy  procedure  is  to  find  an  assignment  to  the  proba¬ 
bility  vector  b)  subject  to  the  following  two  conditions: 
Condition  1:  b)  is  normalized:  (lb)  =  l-Condition 
2:  Known  assignment  of  the  means  {  r  =  1,2,3...to): 

{P\9r  (b)))  =  {9r{\x})}  =  Or- 

These  two  conditions  provide  two  equations,  so 
(n  —  2)  more  are  needed.  One  approach  is  to  find 
an  assignment  of  probabilities  that  maximizes  the  en¬ 
tropy.  These  two  condition  plus  the  principle  [10]: 
"  The  distribution  \p)  that  maximizes  the  uneertainty 
in  the  expeeted  value  subject  to  the  constraint  of  the 
available  information”  provides  such  an  assignment. 
The  method  of  Lagrange  undetermined  multipliers  is 
a  standard  method  for  solving  an  optimization  prob¬ 
lem  subject  to  constraints.  The  function  one  wants  to 


Other  Lagrange  undetermined  multipliers  are  ob¬ 
tained  by  substituting  probability  assignment  into 
Condition  2  in  the  Appendix, 

71 

{p\9r  (b)))  =  X!  ,  (69) 

i=l 

which  solves  the  problem  of  assigning  the  probabilities. 
Define  the  partition  function  as  Z  as 

g(Ao-ei)  ^  (70) 

The  Hamiltonian  H  is  defined  as 

m 

Hi  =  ^  (  Xj-Qr  (Xi)  ,  ("fl) 

r—1 

=  (Xl9(xi)} .  (72) 

The  probability  assignment  becomes 
/  I  \  exp(-i7*) 

Wi)  =Pi  = - ,  (73) 

which  is  a  familiar  expression  to  physicists. 

A  general  model  for  any  process  that  can  be  viewed 
as  a  weighted  graph  (communication  or  fusion  network 
for  example),  e.g.  the  connected  graph  has  a  weight 
Wij  associated  with  each  arc  {i,j),  we  can  associate  a 
number  that  can  be  defined  as  a  probability,  or  as  a 
fusion  rule  for  different  sources  or  as  a  multiple  model 


filter  with  the  data  drawn  from  different  sources,  by 
the  assignment  : 


P  = 

ij 


-  if  (f,j)  is  an  arc 
0  if  {i,j)  is  not  an  arc 


(74) 


So  instead  of  interpreting  \iXi{G)  as  the  Lagrange 
undetermined  multiplier  times  some  property  of  the 
graph,  it  can  be  interpreted  as  interaction  potential 
between  components  i  and  j.  The  Xi  is  interpreted  as 
scaling  parameter  /x,  a  field  coupling  parameter,  or  the 
inverse  temperature.  It  is  possible  to  posit  a  variety  of 
interaction  models,  work  out  their  consequences,  and 
work  out  the  equivalent  probability  distributions.  The 
simplest  form  for  the  Hamiltonian  is  when  the  expected 
number  of  edges  (m)  is  known,  so  H{G)  becomes 


H^{G)=  ^im{G).  (75) 


In  general,  can  specify  interaction  Hamiltonian  as 

H  (84) 

i  <  j 

where  A^-  is  parameter  that  couples  each  edge  together 
and  the  degree  sequence  satisfies  Vi  =  (Jij .  The 
partition  function  is 

Z  =  n(l  +  e-^-)-  (85) 

i<j 

Free  energy  is 

F  =  -  ^  ln(l +  .  (86) 

i  <  j 

Note  probability  of  an  edge  between  edge  i  and  j  is  the 
expected  number  of  the  degree  sequence,  then 


This  model  is  trivial,  so  next  consider  the  simplest 
model  where  the  adjacency  matrix  A  (G)  has  compo¬ 
nents  Qij 


_  J  I  if  f  is  connected  to  j 

(0  if  f  is  not  connected  to  j 

The  number  of  edges  m  is 


(76) 


dF  I 

(I  +  e^o)- 

Underlying  basis  for  the  parameterization  that  gives  us 
Bernoulli  model  with 

P(G)=p™(l-p.,)(S)— .  (88) 


m(G)  =  ^  a.,  (G) ,  (77) 

i  <  j 

therefore  the  Hamiltonian  is  given  by: 

H{G)  =  /X  ^  (G) .  (78) 

i  <  j 

The  partition  function  (PF)  is 

Z^{G)  =  ^exp(-H(G)) 

G 

=  (1 -f  exp  (-^))(=^( .  (79) 

In  general  it  is  possible  to  replace  /Xj  with  /x^,  so 
the  Hamiltonian  is  H{G)  =  ^  ^  iJ,^jaij{G)  while  the 

Partition  function  is 

Zp{G)^l[{l  +  e-^^^).  (80) 

i  <  j 

The  free  energy,  which  is  the  logarithm  of  the  partition 
function,  is 


=  -ln(Z^(G)) 
=  —  In  (l  -I- 


i  <  j 


The  first  moment  is 


1 


d/x  (l-be'^'j) 
while  the  standard  deviation  is 

X2  T?  oMi 


2 

- 


gf-,j 


dp'^  {l  +  e^'Xy 


(81) 

(82) 

(83) 


A  term  CiOi  to  the  Hamiltonian  without  changing 
the  probability  model  for  the  graph  while  allowing  us 
to  relate  (Gj)  to  the  characteristic  of  the  graph  inter¬ 
action.  Thus,  this  work  can  be  extended  to  social  net¬ 
works  [6], [17],  [18],  by  broadening  it  to  include  network 
modeling  of  interest.  Any  interaction  model  that  can 
be  cast  into  the  above  form  is  a  hidden  Markov  model. 
We  also  note  that  any  weighting  scheme  used  in  a  fu¬ 
sion  algorithm  for  a  tracking  filter  can  be  cast  in  the 
form  of  an  interaction  network  model,  so 

1.  In  principle,  any  probability  assignment  model 
can  be  thought  of  as  a  physical  interaction  model, 
so  one  can  bring  to  bear  the  full  power  of  statisti¬ 
cal  physics  on  it. 

2.  The  translation  between  physics  interaction 
Hamiltonian  to  graph  theory  probability  models 
needs  to  be  made  more  transparent. 

3.  This  approach  applies  to  any  type  of  network  that 
can  be  modeled  as  a  graph  with  a  normalized  or 
zero/one  entries  for  the  adjacency  matrix. 

Some  areas  for  further  development  of  the  interac¬ 
tion  model  method  include: 

1.  While  this  field  has  rich  predictive  capability,  the 
question  is  usefulness  when  applied  to  specific 
problem  domains. 

2.  Time  evolving  graphs  have  the  potential  to  be  a 
huge  application  area  with  many  different  types  of 
domain  applications. 

3.  This  technique  can  be  applied  to  flow  networks  by 
using  a  non-symmetric  matrix. 


4.  An  exponential  probability  model  is  equivalent 
to  Markov  Model  for  time  evolution.  Deviations 
from  the  exponential  model  can  be  thought  of  as 
"memory"  in  the  system  under  consideration,  so 
the  question  is  what  role  memory  plays  in  the  sys¬ 
tem  interaction  model. 

In  creating  a  network  that  fuses  weighted  informa¬ 
tion  from  different  sources  we  are  creating  the  equiv¬ 
alent  of  a  physical  interaction  system  that  has  an  un¬ 
derlying  physics  that  we  can  strive  to  understand  and 
exploit.  Thus  a  fusion  algorithm  when  looked  at  this 
way  is  physics  model  based  on  interpreting  our  weigh¬ 
ing  of  the  data  from  different  sources  as  an  underlying 
interaction  model. 

5  Conclusions 

Any  assignment  of  probabilities  can  always  be  viewed 
as  the  solution  to  a  Bayesian  Maximum  Entropy  prob¬ 
lem.  Given  that  is  the  case,  then  the  question  can  be 
asked:  "What  is  the  Maximum  Entropy  problem  that 
an  assignment  of  fusion  weights  is  the  answer  to?"  Eor 
other  examples  not  related  to  this  question,  see  the 
suggestive  paper  by  Kesavan  [12]  Being  able  answer 
this  question  for  various  fusion  algorithms,  would  al¬ 
low  us  classify  them  in  the  same  fashion.  This  would 
lead  to  an  ordering  of  algorithms  in  terms  of  the  com¬ 
plexity  of  the  problem  they  solve.  This  will  be  dealt 
with  in  a  subsequent  paper.  Our  ideal  has  been  to 
suggest  new  means  to  understand  the  strengths  and 
limitations  shared  information  based  on  tracking  data 
provided  from  sensor  networks.  The  goal  is  to  hnd  bet¬ 
ter  means  for  sharing  resources,  so  information  can  be 
allocated  across  all  or  part  of  the  platforms  within  the 
network  based  on  a  maximum  entropy  viewpoint. 
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